Recent efforts have applied quantum tomography techniques to the calibration and characterization of complex
quantum detectors using minimal assumptions. In this work we provide detail and insight concerning the
formalism, the experimental and theoretical challenges and the scope of these tomographical tools. Our fo- 
cus is on the detection of photons with avalanche photodiodes and photon number resolving detectors and our 
approach is to fully characterize the quantum operators describing these detectors with a minimal set of well 
specified assumptions. The formalism is completely general and can be applied to a wide range of detectors. 



INTRODUCTION 

The quantum properties of nature reveal themselves only 
to carefully designed measurement techniques H] [S). In 
addition, most quantum information applications both com- 
putational and cryptographic, rely on a certain knowledge 
of the measurement apparatuses involved. Indeed for these 
protocols we often need to ensure that we associate detector 
outcomes with the correct quantum mechanical operation or 
quantum state. More critically, the assumption of a fully char- 
acterized detector completely underlies both quantum state 
tomography (QST) and quantum process tomography (QPT) 
EHHHl. State tomography has become an important tool for 
characterizing states, partially due to the realization that non- 
classical states are a resource for performing tasks such as 
enhanced precision metrology, quantum communication, and 
quantum computation. Often taken for granted, measurement 
also plays a crucial role in these tasks and in some can even 
eliminate the need for entanglement. In enhanced precision 
metrology, appropriate measurements alone can give rise 
to super-resolution [2] and Heisenberg-limited sensitivity
[1]. In communication, measurement allows entanglement
swapping, and thus, is central to quantum repeaters. And 
for computing, measurement based schemes enable quantum 
computation [6, 7]. It follows that measurement should
also be considered a resource for quantum protocols. In 
QST a given number of measurements on many copies 
of an unknown state reveal its density operator 111 El ID . 
Characterising the operators that govern an evolution or a 
channel - QPT - amounts to applying the process to a set of 
input states, and subsequently fully characterising the output 
states m |5l [lOl HT] [12|- In this paper we study quantum 
detector tomography (QDT) Ull El HSl US I? I, in which 
a detector's outcome statistics in response to a set of input 
states determine the operators that describe that detector. 
State and detector tomography evidently exhibit a dual role: 
Either the input is well-known and the detector is to be 
characterised, or the detector is well-known and the state is to 
be tomographically reconstructed.

Throughout the paper, we focus on examples from optics.
However, quantum detector tomography is a general concept,
applicable to any quantum detector. Building upon previous
theoretical descriptions [13, 14, 15, 16], we will introduce the
concept of detector tomography in the next section. Alongside,
we will present examples detailing the reconstruction of
simple detectors to introduce the key concepts. Subsequently,
based on recent experiments [17], we will present the methods
used in optical detector tomography and the convex optimization
methods [18] which allow an efficient and simple numerical
optimization. Finally, in the last section, we will discuss
some of the theoretical and experimental challenges involved
and how to address them.

DETECTOR TOMOGRAPHY 
Definitions 

In quantum mechanics, the operator describing a measurement
apparatus is, in its most general form, a positive operator
valued measure (POVM). The POVM elements $\{\pi_n\}$ describe
the possible outcomes, labeled here by $n$. In particular, for a
projective measurement, the POVM elements are orthogonal
and simplify to the familiar form

$$ \pi_n = |\psi_n\rangle\langle\psi_n| . $$

In quantum optics, an example of a POVM that consists of
projectors is that of an eight-port homodyne,
$\{|\alpha\rangle\langle\alpha| : \alpha \in \mathbb{C}\}$.

Now, given an input state $\rho$, the probability $p_{\rho,n}$ of
obtaining outcome $n$ is

$$ p_{\rho,n} = \mathrm{Tr}(\rho\,\pi_n) . \qquad (1) $$

It follows that the POVM elements must be positive semi-definite,
$\pi_n \geq 0$, while observing

$$ \sum_n \pi_n = 1 , \qquad (2) $$

if we want probabilities adding up to one. Inverting Eq. (1) to
extract $\pi_n$ subject to the aforementioned conditions is the task
behind detector tomography.

Assumptions

Detector tomography raises fundamental questions about 
the kind of information we can extract from Nature. It is rea- 
sonable to think that state tomography performed with poorly 
characterized detectors can lead to unwanted errors. In ad- 
dition, for quantum cryptography, a mischaracterization of 
states or detectors may lead to channels through which an 
eavesdropper may attack, invalidating certain security proofs
(see for example [19]). Indeed, if we misjudge the noise level, then
we misjudge the information the eavesdropper possesses. In- 
terestingly, quantum key distribution with totally uncharacter- 
ized detectors is possible in principle, based entirely on cor- 
relations violating Bell's inequalities, but at the expense of 
having much smaller rates [20, 21]. An in-depth discussion
of our assumptions is therefore necessary to avoid unwanted 
errors in our detector estimation. 

Generally, there are going to be assumptions about the input
states produced by the source. In the reconstruction, we
will often need to assume we can truncate an infinite dimensional
Hilbert space, such as in the case of photon number. On
the detector side, a common assumption will be that the detector
is memory-less: the previous measurements do not modify
the result of future measurements. For example, this assumption
fails when detector deadtimes are longer than the time
between consecutive measurements. All of these assumptions
need to be tested.

More generally, we can ask what the minimal working set
of assumptions happens to be. An assumption-free tomography
could use a complete black box approach: prepare a
collection of unknown states, measure them and try to draw
some conclusions about both the detector and the states.
Specifically, we could have some classical controls to prepare
quantum states $\{\rho_\lambda\}$ characterized by the index $\lambda$ and some
classical pointer to indicate the possible outcomes, labeled
$n$. Minimizing the set of assumptions would constrain us to
draw our conclusions exclusively from the joint probability
distribution $\{p_{\lambda,n}\}$.

To further constrain the problem we can add the standard
assumptions: an underlying Hilbert space of fixed
dimension $d$, normalized positive density matrices and positive
POVM elements $\{\pi_n\}$ satisfying $\sum_{n=1}^{N} \pi_n = 1$. However,
without further assumptions, the relationship between $\lambda$ and
$\rho$'s $d^2 - 1$ parameters is completely unknown, as is that
between $n$ and $\{\pi_n\}$'s $d^2(N - 1)$ parameters.

This discussion highlights the inherent difficulties that a 
fully general inference (or tomographic) scheme entails. Rea- 
sonable assumptions are thus needed but the question of gen- 
eral tomography remains an interesting one to be explored. 
In this direction, some progress has been made in self-testing 
maps. In this context states are prepared with classical recipes 
and families of unitary gates are revealed with few assump- 
tions about the quantum states (however known measurements 
in the computational basis are required) [22]. As shown
diagrammatically in Fig. 1, we can divide any experiment into
preparation, evolution and measurement. If one of the elements
of the triad is unknown or missing, then we need to characterize
the other two beforehand, making assumptions in the
process.

FIG. 1: It is generally possible to divide an experiment into preparation,
evolution, and measurement. However, if one corner in the
above diagram is unknown or missing, then we need assumptions
about the other two.



Practical detector tomography 

In state tomography, one must perform a set of measurements
$\{\pi_n\}$ spanning the space of the density operator to
be reconstructed. If the state is defined on a $d$-dimensional
Hilbert space, then it will be fully specified by $d^2 - 1$ real
parameters (respecting the constraint of unit trace). To fully
characterize a quantum detector, we need the data obtained
from measuring input states from a well-characterized source.
To recover all the POVM elements $\{\pi_n\}$ from the measured
statistics $p_{\rho,n}$, the probe states or input states must also be
chosen to form a set $\{\rho_j\}$ that is tomographically complete: the
span of the operators $\{\rho_j\}$, which are not necessarily linearly
independent, must be the entire space from which $\pi_n$
is taken. A spanning set forming an operator basis will have
at least $d^2(k - 1)$ elements for a $k$-outcome detector. In principle
this is sufficient to calculate the direct inversion of Eq.
(1). However, experimental detector tomography carries additional
requirements. Clearly the probe states should be
characterised beforehand, and large numbers of them should be
easily and reliably generated. In the case of optical detectors,
coherent states are ideal candidates since a laser can generate
them directly and we can create a tomographically complete
set by transforming their amplitude through attenuation (for
example with a beam splitter) and their phase with a phase-shifter
(an optical path delay). Using input states
$\{|\alpha\rangle\langle\alpha| : \alpha \in \mathbb{C}\}$ one can
then reconstruct the Q-function of the detector [13], which is
simply proportional to the measured statistics,

$$ p_{\alpha,n} = \langle\alpha|\pi_n|\alpha\rangle \propto Q_n(\alpha) . \qquad (3) $$

Since $Q_n(\alpha)$ of each POVM element contains the same information as
the element $\pi_n$ itself, predictions of the detection probabilities
for arbitrary input states can then be calculated directly from
the Q-function.



Simple example: the perfect photocounter 

Consider as a simple example the case of a perfect photon
number detector. This projective measurement is characterized
by its POVM elements, $\{\pi_n = |n\rangle\langle n| : n = 0, \dots, N\}$.
In the simplest of scenarios, using pure number states,
$\{\rho = |m\rangle\langle m| : m = 0, \dots, N\}$, the characterization would
be trivial, since the statistics

$$ p_{\rho,n} = \delta_{m,n} $$


would immediately characterize our detector. From a more
practical perspective, pure number states are very hard to generate,
especially for high photon numbers. Using coherent
states is therefore a more realistic approach. Assuming a collection
$\{|\alpha_i\rangle : i = 1, \dots, D\}$ of perfect coherent states, our
statistics would then become

$$ p_{\alpha_i,n} = \mathrm{Tr}\left(|\alpha_i\rangle\langle\alpha_i|n\rangle\langle n|\right) = e^{-|\alpha_i|^2}\,\frac{|\alpha_i|^{2n}}{n!} . $$


Of course in an experiment we would not know the POVM
elements in advance, and we would need to express them with
a generic expression such as

$$ \pi_n = \sum_{k,m} \theta^{(n)}_{k,m}\, |k\rangle\langle m| , $$

subject only to the constraint $0 \leq \pi_n \leq 1$. Fortunately,
these operators can be simplified: interposing a phase shifter
in the coherent beam's path we can check if the statistics
are independent of the phase. If they are, we can infer the
phase independence of the POVM. Our operators then become
$\pi_n = \sum_k \theta^{(n)}_k |k\rangle\langle k|$ and the statistics can be expressed as

$$ p_{\alpha_i,n} = e^{-|\alpha_i|^2} \sum_{k=0}^{\infty} \frac{|\alpha_i|^{2k}}{k!}\, \theta^{(n)}_k . \qquad (4) $$




For a perfect photon number detector that can discriminate
up to eight photons, the outcome probability distributions
would be the ones shown in Fig. 2.

We now consider how one would estimate the POVM elements
from these probability distributions, which form a set of
simple simulated data. Our goal is to invert Eq. (4). To do so
we can write a matrix version of the equation. Given the set
of coherent states $\{\alpha_1, \dots, \alpha_D\}$, and truncating the number
states at a sufficiently large $M$, we can write

$$ P = F\,\Pi . \qquad (5) $$

FIG. 2: Outcome probability distributions for an 8-outcome detector. 
Each curve represents the probability of that outcome (zero clicks, 
one click, etc) happening vs. the value of the intensity of the coherent 
state arriving at the detector. 



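Here, we have taken

$$ \begin{aligned}
p_{\alpha_i,n} &= \langle\alpha_i|\pi_n|\alpha_i\rangle \\
&= \langle\alpha_i|\Big(\sum_{k=0}^{M} \theta^{(n)}_k |k\rangle\langle k|\Big)|\alpha_i\rangle \\
&= \sum_{k=0}^{M} e^{-|\alpha_i|^2}\,\frac{|\alpha_i|^{2k}}{k!}\,\theta^{(n)}_k \\
&= \sum_{k=0}^{M} F_k(\alpha_i)\,\theta^{(n)}_k \\
&= \sum_{k=0}^{M} F_{i,k}\,\Pi_{k,n} .
\end{aligned} $$

So the matrix $P$ has entries $P_{i,n} = p_{\alpha_i,n}$,
$F$ entries $F_{i,k} = F_k(\alpha_i)$, and $\Pi$ entries $\Pi_{k,n} = \theta^{(n)}_k$. For
an $N$-outcome detector, this gives the matrix dimensions of
$P \in \mathbb{C}^{D \times N}$, $F \in \mathbb{C}^{D \times M}$ and $\Pi \in \mathbb{C}^{M \times N}$.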
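
As a minimal sketch of the forward model in Eq. (5), the snippet below builds $F$ from the Poissonian photon statistics of coherent states, builds $\Pi$ for an idealized counter, and multiplies them. The function names, the choice of eight resolved outcomes plus one overflow outcome, and the truncation at 60 photons are illustrative assumptions, not values taken from the experiment.

```python
import math

def build_F(alphas, M):
    """F[i][k] = exp(-|alpha_i|^2) |alpha_i|^(2k) / k!  for k = 0..M-1."""
    return [[math.exp(-a * a) * (a * a) ** k / math.factorial(k) for k in range(M)]
            for a in alphas]

def perfect_counter_Pi(M, n_resolved):
    """Pi[k][n] = theta_k^(n) for an ideal counter (hypothetical helper).

    Outcomes 0..n_resolved-1 are the projectors |n><n|; the last outcome
    collects every photon number >= n_resolved.
    """
    Pi = [[0.0] * (n_resolved + 1) for _ in range(M)]
    for k in range(M):
        Pi[k][min(k, n_resolved)] = 1.0
    return Pi

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Outcome probabilities P = F Pi for a few coherent-state amplitudes.
alphas = [0.0, 1.0, 2.0]
P = matmul(build_F(alphas, 60), perfect_counter_Pi(60, 8))
for alpha, row in zip(alphas, P):
    print(alpha, [round(p, 4) for p in row])
```

Each row of $P$ is the outcome distribution for one probe amplitude; with a sufficiently large truncation the rows sum to one up to rounding.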

Now, to obtain $\Pi$, we can simply solve the convex optimization
problem

$$ \min \{\|P - F\,\Pi\|_2\}, \quad \text{subject to } \pi_n \geq 0, \ \sum_{n=1}^{N} \pi_n = 1 . \qquad (6) $$

This is a convex problem, because the norm $\|\cdot\|_2$, defined as
$\|A\|_2 = \big(\sum_{i,j} |A_{i,j}|^2\big)^{1/2}$ for a matrix $A$, is a convex function
and the positivity constraint $\pi_n \geq 0$ is semi-definite. The
result of a single such minimization is shown in Fig. 3, where
we recover the expected POVM elements

$$ \Big\{ |0\rangle\langle 0|,\ |1\rangle\langle 1|,\ \dots,\ |7\rangle\langle 7|,\ 1 - \sum_{n=0}^{7} |n\rangle\langle n| \Big\} . $$

It is remarkable that even when some statistical noise is introduced
in the simulated data the results are essentially unchanged. Indeed,
if instead of using $p_{\alpha_i,n} = \mathrm{Tr}(|\alpha_i\rangle\langle\alpha_i|n\rangle\langle n|)$ we use
$p_{\alpha_i,n} = \mathrm{Tr}(|\alpha_i + \delta_i\rangle\langle\alpha_i + \delta_i|n\rangle\langle n|)$, where $\{\delta_i\}$ represent a
2% random noise, then the reconstruction is just as robust as without
noise. This will be discussed in more detail in later sections as it
relates to the technical noise of the laser. Let us then move on
to more realistic detectors and see the problems that loss and
finite photon resolution introduce.
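
The constrained least-squares problem of Eq. (6) can be sketched in a few lines for the phase-insensitive case, where $\pi_n \geq 0$ and $\sum_n \pi_n = 1$ reduce to every row of $\Pi$ lying on the probability simplex. The sketch below uses projected gradient descent as a simple stand-in for a general-purpose convex solver; it is not the solver used for the results above, and the function names, step-size rule, and iteration count are assumptions made for illustration.

```python
import math

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def build_F(alphas, M):
    """F[i][k] = exp(-|alpha_i|^2) |alpha_i|^(2k) / k!  for k = 0..M-1."""
    return [[math.exp(-a * a) * (a * a) ** k / math.factorial(k) for k in range(M)]
            for a in alphas]

def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    cumulative, tau = 0.0, 0.0
    for i, u in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u
        t = (cumulative - 1.0) / i
        if u - t > 0:
            tau = t  # the last index passing this test fixes the threshold
    return [max(x - tau, 0.0) for x in v]

def residual(P, F, Pi):
    """Frobenius norm ||P - F Pi||_2 as defined below Eq. (6)."""
    FPi = matmul(F, Pi)
    return math.sqrt(sum((p - q) ** 2 for rp, rq in zip(P, FPi) for p, q in zip(rp, rq)))

def reconstruct(P, F, n_outcomes, iterations=500):
    """Projected gradient descent on ||P - F Pi||^2 with simplex-constrained rows."""
    Pi = [[1.0 / n_outcomes] * n_outcomes for _ in F[0]]  # uniform starting guess
    # The gradient is 2 F^T (F Pi - P), with Lipschitz constant at most 2 ||F||_F^2,
    # so moving along F^T (F Pi - P) with this step never increases the objective.
    step = 1.0 / sum(f * f for row in F for f in row)
    Ft = [list(col) for col in zip(*F)]
    for _ in range(iterations):
        diff = [[q - p for p, q in zip(rp, rq)] for rp, rq in zip(P, matmul(F, Pi))]
        grad = matmul(Ft, diff)
        Pi = [project_to_simplex([x - step * g for x, g in zip(row, grow)])
              for row, grow in zip(Pi, grad)]
    return Pi

# Simulated data from a toy detector: outcomes "0", "1" and "2 or more" photons.
F = build_F([0.15 * i for i in range(12)], 6)
Pi_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 0, 1]]
P = matmul(F, Pi_true)
Pi_hat = reconstruct(P, F, 3)
print(residual(P, F, Pi_hat))
```

First-order methods such as this one converge slowly when $F$ is ill-conditioned, which is one reason to prefer a dedicated convex-optimization package for real data.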

FIG. 3: Reconstructed POVM of a theoretical photon number detector
with 9 bins. The POVM elements, $\pi_n = \sum_k \theta^{(n)}_k |k\rangle\langle k|$, are the
result of the optimization in Eq. (6), and show a perfect result even
though a 2% random noise was added to the data (changing the value
of the coherent state amplitudes).



WHY DETECTOR TOMOGRAPHY? 

In spite of its first successful applications, one could question
the need for detector tomography. After all, detectors
have been calibrated in the past without tomographic techniques.
However, as quantum technology makes striking advances,
quantum detectors are becoming more complex and,
thus, susceptible to imperfections that are not incorporated in
the bottom-up approach of modelling. In contrast, with tomography
one can design detectors with a top-down approach
by fully characterizing the final detector operation. For example,
photo-detection has seen the advent of single-carbon-nanotube
detectors [23], charge integration photon detectors
(CIPD) [24], Visible Light Photon Counters (VLPC) [25],
quantum dot arrays [26], superconducting edge and picosecond
sensors [27, 28] or time multiplexing detectors based on
commercial Si-APDs [29, 30]. Certainly understanding and
modelling in full detail the noise, loss and coherence characteristics
of these technologies is far from trivial. Detector
tomography is an answer to those challenges and, additionally,
will allow us to benchmark competing detectors.
Such characterized detectors also allow for the preparation
of non-Gaussian states in a certified manner, such as photon
subtracted states. Hence, such detectors are readily usable
in protocols such as entanglement distillation based on non-Gaussian
states [31, 32, 33], in schemes increasing entanglement
by means of photon subtraction [34], enhancing the
teleportation fidelity [35, 36], or in other applications of non-Gaussian
states [37, 38], specifically in the context of quantum
metrology.

Let us discuss more precisely how detector tomography can 
provide an advantage with respect to traditional calibration 
methods. 



The limits of calibration 

Standard calibration methods require a model of the detector.
The parameters of this model are then estimated experimentally
using states and standard assumptions. However, for
specific applications, these models can become very complicated,
with a daunting number of parameters. For example, a
detector as simple as a Yes/No avalanche photodiode (APD)
becomes very hard to model when all noise sources are studied
[39, 40]. Tomography sidesteps this by enabling the operational
detector to be measured directly. Another advantage
can be to avoid errors or pitfalls from standard calibration.
An example is the use of Bell-state detectors which can appear
to be working while frequency correlations in the input
states obscure the results [41, 42]. To avoid such pitfalls detectors
are often used in the ranges where their behaviour is easily
calibrated. Detector tomography could extend the range of
applicability of existing detectors and help design more complicated
ones. Let us now look at other advantages of full
detector tomography.

Quantitative entanglement verification 

Once a detector is fully characterized, it can be used to char- 
acterize states in a certified fashion. In this respect, a detector 
is still useful if it is imperfect in the sense that its POVM el- 
ements are not just unit rank projections: One can use them 
in an estimation problem, or in a setting that unambiguously 
estimates properties of a state. This is one of the key applica- 
tions of detector tomography, in that imperfect devices can be 
used to very reliably perform estimation. A specifically im- 
portant example is the direct detection of entanglement: Once 
detectors are characterized, one can perform measurements 
and then find lower bounds to entanglement measures, based 
on these measurements. This is an idea presented in Ref. [47]
(see also Refs. [48, 49, 50, 51]). Here we discuss the specific
issues related to infinite-dimensional systems and photon
counters.

Assume that we have performed more than one type of measurement,
labeled by $k$, for which we have completely characterized
the POVM elements $\{\pi^{(k)}_n\}$ to great accuracy. Using
two such devices one can make local measurements on each
part of a bipartite state. With the data $d^{(k,l)}_{n,m}$ from
these measurements
one can then ask for the minimal degree of entanglement consistent
with it. This approach does not make any assumptions
on the preparation of states, and works even for detectors with
a low detection efficiency: this will already be incorporated
in the stated bound. The unambiguously minimal degree of
entanglement, say in terms of the negativity, is then given by
the solution of the optimization problem [47]:

$$ \min E_N(\rho) = \|\rho^{\Gamma}\|_1 - 1, \quad \text{subject to } \mathrm{Tr}\big((\pi^{(k)}_n \otimes \pi^{(l)}_m)\rho\big) = d^{(k,l)}_{n,m}, \ \mathrm{Tr}(\rho) = 1, \ \rho \geq 0 . $$

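One easily finds lower bounds to the optimal solution to this
problem by considering operators of the form

$$ P = \sum_{n,m,k,l} a^{(k,l)}_{n,m}\, \pi^{(k)}_n \otimes (\pi^{(l)}_m)^{\Gamma} + \beta 1 $$

such that $\|P\| \leq 1$. Then a lower bound is readily given by

$$ \|\rho^{\Gamma}\|_1 \geq \mathrm{Tr}(\rho^{\Gamma} P) = \mathrm{Tr}(\rho P^{\Gamma}) . $$

The optimal lower bound, based on such an approach, is in
turn the solution of the convex optimization problem [18] in
$a^{(k,l)}_{n,m}$ and $\beta$ given by

$$ \max \sum_{n,m,k,l} a^{(k,l)}_{n,m}\, d^{(k,l)}_{n,m} + \beta - 1, \quad \text{subject to } -1 \leq \sum_{n,m,k,l} a^{(k,l)}_{n,m}\, \pi^{(k)}_n \otimes (\pi^{(l)}_m)^{\Gamma} + \beta 1 \leq 1 . $$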

Some conditions need to be examined to make sure that the 
solution to this formulation (dual optimal) coincides with the 
solution of the original formulation (primal optimal). For linear
programs or for semi-definite problems (SDP) such as the one
described, these conditions are easily satisfied [18]. As we will
see this type of optimisation method is also useful for detector 
tomography itself. 

Here, if one just makes use of POVM elements with a fi- 
nite support, as one has in photon counting with an additional 
phase reference, then such bounds will provide very strong 
lower bounds. However, it will not provide good bounds for 
unbounded operators such as in homodyning. Hence, without 
having detectors of high efficiency, and without assumptions 
on the preparation of the entangled state, one can use a charac- 
terized detector to certify entanglement in quantitative terms. 



MODELLING PHOTODETECTORS 

As we have seen, the aim of detector tomography is to 
identify the physical POVM closest to the experimental data 
with minimal assumptions on the functioning of the detectors. 
To compare this method with a more traditional calibration 
method let us study a photodetector modelling example. We 
will do so for an avalanche photodiode and a photon number 
resolving detector able to detect up to 8 photons [29]. In the
next section we will compare these models to the experimen- 
tal results derived without any model. 

Optical photon number detectors 

An important detector in quantum optics is the single-photon
counting module based on a silicon avalanche photodiode
(APD). It has two detection outcomes, either registering
an electronic pulse (1-click) or not (0-clicks). A loss-free perfect
version of it would implement the Kraus operators

$$ \{ |0\rangle\langle 0|,\ 1 - |0\rangle\langle 0| \}, \qquad (8) $$

distinguishing between the presence or absence of photons.
However, some photons are absorbed without triggering a
pulse. This loss can be modelled by placing a beam splitter (BS)
in front of the perfect detector [52]. The POVM elements describing
a detector with a BS of transmittivity $\eta$ can then be written as

$$ \text{NO CLICK}: \quad \pi_0 = \sum_{n=0}^{\infty} (1 - \eta)^n\, |n\rangle\langle n| , \qquad (9) $$

$$ \text{CLICK}: \quad \pi_1 = 1 - \sum_{n=0}^{\infty} (1 - \eta)^n\, |n\rangle\langle n| , \qquad (10) $$

disregarding after-pulsing or dark counts [53]. Having only
two outcomes, this detector cannot distinguish the number of
photons present.
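
To illustrate Eqs. (9) and (10), the sketch below evaluates the click probability of the lossy APD for a coherent-state input by summing the Poissonian photon statistics against the diagonal of $\pi_0$, which should agree with the closed form $1 - e^{-\eta|\alpha|^2}$. The function names and the photon-number cutoff `n_max` are illustrative assumptions.

```python
import math

def no_click_povm_diagonal(eta, n_max):
    """Diagonal of pi_0 in the number basis, Eq. (9): (1 - eta)^n for n = 0..n_max."""
    return [(1.0 - eta) ** n for n in range(n_max + 1)]

def click_probability(alpha_abs, eta, n_max=80):
    """Tr(|alpha><alpha| pi_1) for the lossy APD of Eq. (10), truncated at n_max photons."""
    mean = alpha_abs ** 2
    pi0 = no_click_povm_diagonal(eta, n_max)
    p_no_click = sum(math.exp(-mean) * mean ** n / math.factorial(n) * pi0[n]
                     for n in range(n_max + 1))
    return 1.0 - p_no_click

# Compare the truncated sum with the closed form 1 - exp(-eta |alpha|^2).
alpha_abs, eta = 1.5, 0.6
print(click_probability(alpha_abs, eta), 1.0 - math.exp(-eta * alpha_abs ** 2))
```

With $\eta = 1$ the diagonal of $\pi_0$ reduces to that of $|0\rangle\langle 0|$, recovering the loss-free detector of Eq. (8).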

A more advanced detector called time multiplexing detec- 
tor (TMD) does have certain photon-number resolution. It 
splits the incoming pulse into many temporal bins, making 
unlikely the presence of more than one photon per bin. All 
the time bins are then detected with two APDs. Summing 
the number of 1-click outcomes from all the bins one can 
then estimate the probability of having detected a number 
of incoming photons. This detector is not commercially 
available but one has been constructed by the Ultrafast Group 
in Oxford [29]. It has eight bins in total (four time bins in
each of two output fibres) and thus nine outcomes - from zero 
to eight clicks. 

The theoretical description of this detector is a bit more in- 
volved since there is what we call the "binning problem". In- 
deed, in addition to loss there is a certain probability that all 
photons will end up in a single time bin, or more generally 
that k incoming photons will result in less than k clicks. To 
account for the details describing these probabilities we use a
recursive relation [54, 55]. Our goal is to describe the following
probability distribution:

$p^N(j|k)$: probability of having $j$ clicks given that there
were $k$ incident photons and that the detector has $N$ bins (or
modes). Let us start with the simplest possible TMD, which
would consist of an input port, a beam splitter (with reflectivity
and transmittivity $R$ and $T$) and two YES/NO detectors.
This detector is shown in Fig. 4 and has two bins.




FIG. 4: Diagram of a simplified 2-bin multiplexing detector. 

For this simple example, the probability distribution we are
after is $p^2(j|k)$. We will show how to calculate $p^2(j|k)$,
$p^4(j|k)$ and then how to go from $p^N(j|k)$ to $p^{2N}(j|k)$. For
the two-bin case from Fig. 4 consider a BS with transmittivity
$T$ and reflectivity $R$. In that case:

• $p^2(j|0) = \delta_{j,0}$ (if no photons are present we will only
register zero clicks).

• $p^2(1|k) = T^k + R^k$ (with probability $T^k$, $k$ photons end
up in the lower bin and the same holds for the upper bin
with $R^k$. The probability of a single click is the sum of
these two probabilities).

• $p^2(2|k) = 1 - (T^k + R^k)$ (if $k \neq 0$ then only two events
may happen: one click or two. This complementary
event therefore has $P = 1 -$ (probability of 1 click)).

In the case of a 4-bin detector shown in Fig. 5, $k$ incoming
photons are distributed to two 2-bin detectors according to a
binomial distribution.

Now let us evaluate the probability for the upper 2-bin detector
to register $s$ counts if $x$ photons entered while registering
$m$ clicks in the lower 2-bin detector if $k - x$ entered the
lower port. This should be $p^2(s|x)\,p^2(m|k - x)$ weighted by
the probability that $x$ photons enter the upper branch and $k - x$
the lower one, which is

$$ \binom{k}{x} R^x\, T^{k-x} . $$

Now the probability that $j$ counts are found overall is found by
summing the weighted probability over all possible ways that
the detectors can find $j$ counts (i.e., $m + s = j$) and summing
over all possible ways of distributing $k$ photons:

$$ p^4(j|k) = \sum_{x=0}^{k} \sum_{m+s=j} \binom{k}{x} R^x\, T^{k-x}\, p^2(s|x)\, p^2(m|k - x) . $$



FIG. 5: Diagram of a simplified 4-bin multiplexing detector. The first
beam splitter distributes $k$ photons according to a binomial distribution
between the two 2-bin loops of the second stage.




FIG. 6: Diagram of a $2N$-bin multiplexing detector. The first beam
splitter distributes $k$ photons according to a binomial distribution
between the two next $N$-bin stages.

We can extend the same argument to $2N$. Imagine we know
$p^N(j|k)$. Now, for that detector to become a $2N$-bin detector
all we need is to couple two of them to a beam splitter which
will distribute the $k$ photons as described above and as shown
in Fig. 6. In that same fashion we can then define the recursive
relation

$$ p^{2N}(j|k) = \sum_{x=0}^{k} \sum_{m+s=j} \binom{k}{x} R^x\, T^{k-x}\, p^N(s|x)\, p^N(m|k - x) . \qquad (11) $$
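
The recursion of Eq. (11) translates almost directly into code. The sketch below is one possible implementation under the simplifying assumption that every beam splitter has the same reflectivity $R$ and that there is no loss; the names `make_tmd` and `p` are illustrative.

```python
from functools import lru_cache
from math import comb

def make_tmd(R):
    """Return p(bins, j, k) = p^bins(j|k) for a loss-free TMD with bins = 2, 4, 8, ...

    Assumes the same reflectivity R (transmittivity T = 1 - R) at every node.
    """
    T = 1.0 - R

    @lru_cache(maxsize=None)
    def p(bins, j, k):
        if j < 0 or j > bins or j > k:
            return 0.0  # cannot have more clicks than bins or than photons
        if bins == 2:
            if k == 0:
                return 1.0  # only j == 0 reaches this point
            if j == 0:
                return 0.0  # without loss, k >= 1 photons always give a click
            if j == 1:
                return T ** k + R ** k
            return 1.0 - (T ** k + R ** k)  # j == 2
        half = bins // 2
        total = 0.0
        for x in range(k + 1):  # x photons go to the upper branch
            weight = comb(k, x) * R ** x * T ** (k - x)
            for s in range(j + 1):  # s clicks in the upper branch, j - s in the lower
                total += weight * p(half, s, x) * p(half, j - s, k - x)
        return total

    return p

# Click distribution of an 8-bin TMD with balanced beam splitters and 5 photons.
p = make_tmd(0.5)
print([round(p(8, j, 5), 4) for j in range(9)])
```

Caching the intermediate $p^N(j|k)$ keeps the cost low, since the same sub-detector probabilities are reused many times across the double sum.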

This of course can be generalized to accommodate a differ- 
ent BS reflectivity at each node (adding an index to T and R to 
account for its position). Based on this recursion and once we 
determine all T and R, we can write a simple and efficient pro- 
gram to generate the corresponding theoretical POVMs. For 
example, a 6-outcome detector would have a POVM which 
can be captured in the following matrix: 

$$ B = \big( B_{j,k} \big), $$

where $j = 0, \dots, 5$ labels the outcome and $k = 0, \dots$ the incident photon number.

For example, the 5-click event has a POVM element

$$ \pi_5 \approx 0.2\,|5\rangle\langle 5| + 0.4\,|6\rangle\langle 6| + 0.6\,|7\rangle\langle 7| + 0.8\,|8\rangle\langle 8| , $$

etc. More generally, the measured statistics are related to the
incoming photons by

$$ p_j = \sum_k p^N(j|k)\, q_k , $$

where $p_j$ is the probability of detecting $j$ counts and $q_k$ the
probability that $k$ photons arrived at the TMD [30]. The matrix
$B$ introduced above then relates probabilities and density
matrices through $p = B \cdot \rho$.

Detector loss 

TMD detectors have various sources of loss, meaning that
photons are absorbed before triggering a detection event. The
major sources of loss are the coupling to the fibres, the absorption
and scattering in the delay fibres and the non-unit efficiency
of the detectors [56]. A full description of the effect
of losses is certainly complex, since loss occurs at many stages
of the detector. One can model loss simply as a beamsplitter
diverting photons out of an input state. However, one would
have to include a BS before the detector (fibre coupling), a BS
at each stage of fibre and a BS in front of each APD, altering
Eq. (11) accordingly. Instead we give an effective description,
modeling loss with a single BS of transmittivity $\eta$ in front of
the detector. The matrix capturing the losses has entries that are given by

$$ L_{k',k} = \binom{k}{k'}\, \eta^{k'} (1 - \eta)^{k - k'} , $$

being the binomial distribution accounting for loss, with
$L_{k',k} = 0$ for all $k < k'$. Now combining both descriptions,
we can decouple the loss from the binning, putting a BS coupled
to the environment before the $N$-bin TMD, resulting in

$$ p_j = \sum_{k,k'} p^N(j|k')\, L_{k',k}\, q_k . $$
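
A short sketch of the loss matrix may help make its action concrete. Applied to the Poissonian photon-number distribution of a coherent state with mean photon number $|\alpha|^2$, binomial loss with transmittivity $\eta$ should return a Poissonian distribution with mean $\eta|\alpha|^2$, which the example checks numerically. The function names and the cutoff `k_max` are illustrative assumptions.

```python
from math import comb, exp, factorial

def loss_matrix(eta, k_max):
    """L[k'][k] = C(k, k') eta^k' (1 - eta)^(k - k'), and zero for k < k'."""
    return [[comb(k, kp) * eta ** kp * (1.0 - eta) ** (k - kp) if kp <= k else 0.0
             for k in range(k_max + 1)]
            for kp in range(k_max + 1)]

def apply_loss(L, q):
    """Photon-number distribution after loss: q'[k'] = sum_k L[k'][k] q[k]."""
    return [sum(row[k] * q[k] for k in range(len(q))) for row in L]

def poisson(mean, k_max):
    """Photon-number distribution of a coherent state with the given mean."""
    return [exp(-mean) * mean ** k / factorial(k) for k in range(k_max + 1)]

# A coherent state with mean photon number 2 behind a BS with eta = 0.3.
q_after = apply_loss(loss_matrix(0.3, 40), poisson(2.0, 40))
print([round(x, 5) for x in q_after[:5]])
print([round(x, 5) for x in poisson(0.6, 40)[:5]])
```

Multiplying this matrix from the left by the binning probabilities $p^N(j|k')$ gives the combined loss-plus-binning model of the last equation above.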



This relationship expresses how the incoming photons experi- 
ence loss and then are distributed among the available modes. 
The model of the TMD described up to this point would be 
the one needed without detector tomography. One could for 
example try to measure independently the transmittivities of 
the inner BS, reconstruct the convolution (or binning) matrix 
B and try to estimate the overall loss to include it in the ma- 
trix L. This would help us build a model of the TMD sketched 
in Fig. 7. By contrast, using detector tomography, we do not
need to know anything about bins, beam splitters, inner de- 
tectors or specific loss mechanisms. Moreover, anything left 
out of our detector model (e.g. dark counts, afterpulsing, etc.) 
would be included in a tomographic characterization. 




FIG. 7: Diagram of a simplified 8 bin time multiplexing detector 
(TMD). The spirals represent a delay in the optical fibre. 



EXPERIMENTAL RECONSTRUCTION 

We now turn to the description of the experimental reali- 
sation. As mentioned earlier, coherent states are ideal probe 
states with which to characterize optical detectors. This holds 
true for any optical detector, including polarization detectors, 
frequency detectors (e.g. spectrometers), and even detectors 
that discriminate inherently quantum states (e.g. a photonic 
Bell-state detector). In most of these cases, we are only interested
in the detector's action at a particular input photon number
$n$ (usually $n = 1$). Still, we can reconstruct the detector
POVM in the full photon number basis and then restrict our 
attention to a particular subspace. It is interesting that an op- 
tical state which can be fully described in a classical theory of 
electromagnetism can be used to characterize uniquely quan- 
tum detectors. To characterize a completely unknown detector 
(i.e. a black box) one would need to vary all the available pa- 
rameters of the probe set of coherent states: spatial-temporal 
mode, polarization, phase, and amplitude (ensuring a tomo- 
graphically complete set of states is constructed). However, 
given that frequency, time, position, momentum, and ampli- 
tude have infinite range, this is impossible in practice. Con- 
sequently, one is required to make realistic assumptions about 
the range of operation and sensitivity of any unknown detector.
One might additionally neglect some of these optical parameters
if one is only interested in a particular aspect of the
full characterization. 

The subjects of our detector tomography, the APD and 
TMD, possess the unique features of single-photon sensitivity 
and photon-number resolution, respectively. To characterize 
these features we vary only the amplitude and phase of probe 
coherent states, while fixing the spatial-temporal mode. In 
particular, the input wavepackets have a time extent shorter
than the time window of the electronics, and the center 
wavelength is within range of the detectors. Light is coupled 
to both types of detector through single-mode optical fiber, 
eliminating the possibility of any variation in the position or 
momentum mode of the light. Critically, for the detectors to 
perform as intended and in order to ensure the detectors are 
memoryless, the wavepackets must be preceded and followed 
by time intervals in which there is no input light. The APD 
is known to have a deadtime of roughly 50 ns; a detection 
that occurs before the input wavepacket arrives at the detector 
will make the detector insensitive to the probe coherent state. 
The TMD splits the incoming wavepacket into time bins 
spread over 500 ns. The inverse of these two timescales then 
sets an upper limit on the rate at which we can send probe 
states to the detectors. We further limit the rate to ensure the 
detectors do not heat up, which would change their properties 
over time. These time variant features of the detectors could 
be illuminated with detector tomography but this would 
be unnecessarily complicated, as they are quickly evident 
without use of the full tomography procedure. 

When operated with a gain high above their lasing threshold, 
lasers produce light well approximated by a coherent state. 
Coherent states (and statistical mixtures thereof) are unique 
amongst pure optical states in that, at a beamsplitter, the 
transmitted and reflected states are unentangled. Conse- 
quently, through attenuation we can vary the amplitude of a 
coherent state without changing its nature. In homodyne state 
tomography, one must use a balanced detection technique to 
negate technical noise in the laser. In contrast, by attenuating
the laser light to the single-photon level in our detector 
tomography scheme, the resulting coherent state possesses 
an inherent amplitude uncertainty that renders the technical 
noise insignificant. 
FIG. 8: Experimental setup: The amplitude of the probe coherent
state is attenuated with a half-waveplate ($\lambda/2$) and a Glan-Thompson
polarizer (P). The light is then further attenuated by frequency-independent
neutral density filters (NDF) and coupled into a
fibre (FC).

We use a cavity dumper (APE Cavity Dumper Kit) to re- 
duce the repetition rate of a pulsed mode-locked Ti:sapph 
laser to R. Long term drift of the laser power over 1 mil- 
lion pulses was < 0.5%. Our laser randomly varies in en- 
ergy between subsequent pulses with a standard deviation of 
1.88% ± 0.02% of $|\alpha|^2$. We vary the amplitude $\alpha$ of our
probe coherent states by rotating their polarization with a
half-waveplate ($\lambda/2$) in front of a Glan-Thompson polarizer (P) as
shown in Fig. 8. We attenuate the coherent states by reflecting
them from a beamsplitter (BS) ($T = 95\%$) and three neutral
density filters (NDF) (i.e. spectrally flat attenuators). Note
that $R/T$ for the BS was measured with a relative deviation
smaller than 1%. Along with some loss upon coupling into a
single-mode fiber, we collect these attenuations together in an
overall attenuation factor $\gamma$. We test for any variation in $\gamma$ as
a function of $\alpha$ (e.g. which might be caused by rotating the
waveplate if it had a wedge) and find that the variation is less
than 1%.

There are, as of yet, no direct techniques to calibrate the 
power of light at the single-photon level [37]. In fact, there
are no laboratory methods to make an absolute calibration of 
power at any intensity. Instead, a chain of photo-detectors 
are calibrated relative to each other. At the beginning of 
the chain is an absolute calibration system held at national 
standards institutes. In the case of our power meter (Coher- 
ent FieldMaxII-TO), the National Institute for Standards and 
Technology (NIST) uses a cryogenic bolometer as its abso- 
lute standard. This chain results in a 5% systematic uncer- 
tainty in our measurements of the laser power P (averaged 
over 0.2 s), measured at the transmitted port of the beamsplitter.
The magnitude of $\alpha$ for the probe state was found via
$|\alpha|^2 = \gamma P \lambda/(R h c)$. For each value of $\alpha$ we recorded the
number of times each detection outcome occurred in $J$ trials
(i.e. laser pulses), which provides an estimate of $p_{\alpha,n}$. The
phase of $\alpha$ was allowed to drift freely, during which no change
in the $p_{\alpha,n}$ was observed. Consequently, we did not actively
vary the phase of our probe states.
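As a rough illustration of the calibration formula $|\alpha|^2 = \gamma P \lambda/(R h c)$, the sketch below converts an average power reading into a mean photon number per pulse. The power value is hypothetical (the text does not quote one), and the attenuation factor is an illustrative number of plausible order of magnitude rather than a measured value.

```python
# Physical constants in SI units (exact by definition since the 2019 SI revision).
H_PLANCK = 6.62607015e-34   # J s
C_LIGHT = 2.99792458e8      # m / s

def mean_photon_number(power_w, wavelength_m, rep_rate_hz, attenuation):
    """Return |alpha|^2 = gamma * P * lambda / (R * h * c)."""
    energy_per_pulse = power_w / rep_rate_hz            # J per laser pulse
    photon_energy = H_PLANCK * C_LIGHT / wavelength_m   # J per photon
    return attenuation * energy_per_pulse / photon_energy

# Hypothetical inputs: 0.1 mW average power, 789 nm, 76.169 kHz repetition
# rate, and an assumed overall attenuation factor of 8.51e-9.
n_mean = mean_photon_number(power_w=1.0e-4,
                            wavelength_m=789e-9,
                            rep_rate_hz=76.169e3,
                            attenuation=8.51e-9)
print(f"mean photon number per pulse: {n_mean:.1f}")
```

Because the relation is linear in $P$, a 5% systematic error in the power meter maps directly onto a 5% error in $|\alpha|^2$.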

The 5% uncertainty in $P$ is the dominant error in our
experiment. For a detector with over 95% efficiency, this
error could result in estimates of $\{p_{\alpha,n}\}$ that are incompatible
with any physical detector. For example, this could result in
more detector clicks on average than there were photons in
the probe state on average. Gain in the detector could achieve
this, but at the same time would necessarily introduce noise
that would change the distribution of $p_{\alpha,n}$. For detectors with
lower efficiency, the systematic error in $P$ will simply add to or
subtract from the efficiency of the detector that results from the
tomographic characterization.

We choose a center wavelength $\lambda$ and a FWHM bandwidth
$\Delta\lambda$ that are appropriate for each detector. In the case of
the APD (a Perkin Elmer SPCM-AQR-13-FC) we set
$\lambda = 780 \pm 1$ nm, $\Delta\lambda = 20$ nm, and chose the appropriate rate
$R = 1.4975 \pm 0.0005$ kHz, $J = 1472967$, and $\gamma = (5.66 \pm
0.08) \times 10^{[\ldots]}$. For the TMD we set $\lambda = 789 \pm 1$ nm,
$\Delta\lambda = 26$ nm, $R = 76.169 \pm 0.001$ kHz, $J = 38084$, and
$\gamma = (8.51 \pm 0.11) \times 10^{-9}$.

Since $|\alpha|^2$ has an infinite range, one must set an upper limit
on the magnitude of $\alpha$ used in our set of probe states. A natural
place for this is at the $\alpha$ at which the detector behavior
saturates, i.e. $p_{\alpha,n}$ stays constant as a function of $\alpha$. Since this
occurs asymptotically, a somewhat arbitrary degree of constancy
must be chosen; we set $|\alpha|^2 = 120$ as our upper limit.
We expected the saturation behavior of the TMD would be
$p_{\alpha,8} \approx 100\%$, whereas we found that it was $p_{\alpha,8} = 93.3\%$
and $p_{\alpha,7} = 6.7\%$. Already, the measured statistics (which are
proportional to $Q_n(\alpha)$) give a clear signature that our theoretical
model of the detector is too simplistic. Moreover, as we
increase $\alpha$ beyond our upper limit, $p_{\alpha,8}$ begins to decrease.
This behavior is caused by a breakdown of our memoryless
detector assumption, and it highlights how crucial it is to create
probe states in the desired spatial-temporal mode. Along with
the probe laser pulse, the cavity dumper also out-couples a
small fraction of the preceding and following pulses in the
Ti:Sapph cavity. These are separated in time from our probe
pulse by 13 ns and each contains only 0.17% of the energy of the
probe pulse. Consequently, these extra pulses will have an
insignificant effect on $p_{\alpha,n}$ for most of the range of $\alpha$. However,
at $|\alpha|^2 = 120$, the preceding pulse will be a coherent
state with $|\alpha|^2 \approx 0.2$. Consequently, roughly 20% of the time
bins in the TMD will be preceded by a photon. If the detector
has, for instance, a 50% efficiency then 10% of the time bins
will be preceded by a detection event that will not be counted
as a click by the electronics (due to their time window). More
importantly, those bins will subsequently be unavailable to detect
photons in our probe pulse, due to the 50 ns deadtime of
the APDs inside the TMD. Thus, as we observe, roughly 10%
of the bins will not result in a click. This behavior was extraneous
to the normal operation of the detector and so we limit
$|\alpha|^2$ to 30 in the tomographic reconstruction. However, it
exemplifies the usefulness of even the basic detector tomography
procedure, which results in approximate $Q_n(\alpha)$, for rough
evaluation of the detector action. Some of these hypotheses or
details could be further explored if some detectors or some
states were well known. For example, the response of the BS and
neutral density filters to single photons could be explored (granted
good single photons and reliable single-photon detectors). The
time independence of the POVMs could also be studied with
well-known states. However, we will see that the excellent fit
of the data to the Q-function and of the reconstruction to the
model suggests a sufficiently good understanding.
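The order-of-magnitude argument for the satellite pulses can be checked with a few lines of arithmetic. The sketch below is a back-of-envelope estimate only: it treats the preceding pulse as a single coherent state with Poissonian statistics and, like the estimate in the text, ignores how that pulse is divided among the TMD time bins. The function name is hypothetical.

```python
import math

def satellite_pulse_estimate(n_probe, energy_fraction, efficiency):
    """Estimate the effect of a weak pulse preceding the probe pulse.

    n_probe         -- mean photon number |alpha|^2 of the probe pulse
    energy_fraction -- energy of the satellite pulse relative to the probe
    efficiency      -- assumed detection efficiency
    """
    n_sat = energy_fraction * n_probe                 # mean photons in satellite
    p_photon = 1.0 - math.exp(-n_sat)                 # at least one photon present
    p_detect = 1.0 - math.exp(-efficiency * n_sat)    # at least one detection
    return n_sat, p_photon, p_detect

# Values quoted in the text: |alpha|^2 = 120, 0.17% energy, 50% efficiency.
n_sat, p_photon, p_detect = satellite_pulse_estimate(120.0, 0.0017, 0.5)
print(f"satellite |alpha|^2 = {n_sat:.3f}")
print(f"P(at least one photon) = {p_photon:.3f}")
print(f"P(at least one detection) = {p_detect:.3f}")
```

Under these assumptions the numbers come out close to the 20% and 10% figures quoted above.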
FIG. 9: This plot presents the probability distributions corresponding
to two detectors, one with negative "POVM" elements and another
one with positive ones. Both POVMs observe $\sum_n \pi_n = \mathbb{1}$.



We now turn to the tomographic reconstruction. To characterize
our detector we have measured the outcome probability
distributions resulting from sending a tomographically complete
set of input states (or probe states). The use of pure coherent
states as probe states implies that the probability distribution
is proportional to the Q-function, as seen in Eq. (3). In
principle the knowledge of the Q-function is then sufficient to
predict the measurement probabilities for any incoming state
since $Q(\alpha,\alpha^*) = \langle\alpha|\rho|\alpha\rangle$ determines $\rho$ completely.

However, a more useful and natural representation for
photodetectors is the POVM expanded in the Fock basis.
Another argument to find the POVM elements is that, due
to statistical noise, the Q-function could correspond to a
non-physical POVM. Indeed, if we simply make a fit to
the noisy measured probability distribution, this fit could
correspond to negative "POVMs". As an example consider
Fig. 9, where two Q-functions are displayed together: even
though they are very similar to each other, one corresponds
to a detector with a non-positive "POVM" element (and thus
negative probabilities) and one corresponds to a physical
one. Our goal is therefore to reconstruct the POVM operators
which most closely match the data and still obey quantum
mechanics (and thus are positive).

Since we adopt a "black box" approach we need not
assume any of the properties described in the previous
theoretical models. Only the accessible parts of the "black
box", i.e., number of outcomes and control (or lack of control)
of the phase, will determine the description of our detector.
The lack of a phase reference simplifies the experimental
procedure, allowing us to solely control the magnitude of
$\alpha$ (as has been done for tomography of a single photon
[65]). A detector with no observed phase dependence will be
described by POVM elements diagonal in the number basis,

$$\pi_n = \sum_{k=0}^{\infty} \theta_k^{(n)} |k\rangle\langle k|,$$

hence simplifying the reconstruction of $\pi_n$.

Again we can express the relationship between the statistics
and our diagonal $\pi_n$ as

$$P = F\Pi,$$

if we measure $D$ different values of $\alpha$, $\alpha_1, \ldots, \alpha_D$, and truncate
the number states at a sufficiently large $M$. For an $N$-outcome
detector, the matrices will have dimensions $P \in \mathbb{C}^{D\times N}$,
$F \in \mathbb{C}^{D\times M}$ and $\Pi \in \mathbb{C}^{M\times N}$. In addition, this relation
FIG. 10: The measured probabilities for different "number-of-clicks" are shown (red dots) as a function of $|\alpha|^2 = \langle n\rangle$. The plot shows the
statistics for the time multiplexed detector (TMD) with 9 time-bins. The statistical error vertically is too small to be seen and the jitter of $|\alpha|^2$
was estimated to be 2% of its value. An additional 5% systematic error in the calibration of the power meter is present but can be absorbed as
loss. From the reconstructed POVM elements $\{\pi_n\}$ we generate the corresponding probability distributions $\mathrm{tr}(\rho_\alpha \pi_n)$ (blue curves). These
are generated for pure $|\alpha\rangle\langle\alpha|$ or mixed $\rho_{|\alpha|}$ and for $\pi_n$ reconstructed with the filter function or without it. For all these options, the probability
distributions (blue lines) are so similar that they are indistinguishable on this scale.



can easily be rewritten when the input state is a mixed state.
This was indeed done to account for the laser's technical noise
(as we will see in the next section) but gave similar results. For
such a detector, the physical POVM consistent with the data
can be estimated through the following optimisation problem:

$$\min\left\{\|P - F\Pi\|_2 + g(\Pi)\right\}, \quad \text{subject to } \pi_n \ge 0,\ \sum_{n=1}^{N} \pi_n = \mathbb{1}, \qquad (12)$$



where the 2-norm ensures it is a convex quadratic problem.
Note that we also allow for convex quadratic filter functions
$g$, which will be discussed in some detail later. These $g$ are
related to the conditioning of the problem and must not depend
on the type of detector. For example, no symmetry or
knowledge of the typical POVM structures in photo-detection
can be assumed. If any, only general regularization functions
that work for any POVM should be chosen. Now, for suitable
filter functions (i.e. quadratic) the whole problem is a convex
quadratic optimisation problem, and hence also a semi-definite
problem (SDP), which can be efficiently solved numerically
[18]. Moreover, in this case, there exists a dual optimisation
problem whose solution coincides with the original
problem. Thus, the dual problem provides a certificate of optimality
since it provides a lower bound to the primal problem.

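To make the structure of $P = F\Pi$ concrete, the following sketch builds the rows of $F$ from the Poissonian photon-number weights of a coherent state, $F_{i,k} = e^{-x_i} x_i^k / k!$ with $x_i = |\alpha_i|^2$, and evaluates the forward model for a textbook lossy "click/no-click" detector. The detector model, the probe values and the use of the Frobenius norm for the data term are assumptions made for illustration; this is the forward map and residual only, not the constrained optimisation itself.

```python
import math

def poisson_weights(x, m_max):
    """Row of F for one probe state: F_k = exp(-x) * x**k / k!, k = 0..m_max."""
    row = [math.exp(-x)]
    for k in range(1, m_max + 1):
        row.append(row[-1] * x / k)   # recursive form avoids large factorials
    return row

def lossy_apd_theta(eta, m_max):
    """Assumed model: theta[k] = [no-click, click] weights for k photons."""
    return [[(1.0 - eta) ** k, 1.0 - (1.0 - eta) ** k] for k in range(m_max + 1)]

def predicted_probabilities(xs, theta):
    """P[i][n] = sum_k F[i][k] * theta[k][n], i.e. the matrix product P = F Pi."""
    m_max = len(theta) - 1
    n_outcomes = len(theta[0])
    table = []
    for x in xs:
        row = poisson_weights(x, m_max)
        table.append([sum(row[k] * theta[k][n] for k in range(m_max + 1))
                      for n in range(n_outcomes)])
    return table

def residual_norm(measured, predicted):
    """Frobenius norm of P_measured - P_predicted (data term of the objective)."""
    return math.sqrt(sum((a - b) ** 2
                         for row_a, row_b in zip(measured, predicted)
                         for a, b in zip(row_a, row_b)))

ETA = 0.568                    # efficiency corresponding to 43.2% loss
XS = [0.5, 2.0, 5.0, 20.0]     # illustrative values of |alpha|^2
P_model = predicted_probabilities(XS, lossy_apd_theta(ETA, m_max=60))
for x, (p_no_click, p_click) in zip(XS, P_model):
    print(f"|alpha|^2 = {x:5.1f}: no-click {p_no_click:.4f}, click {p_click:.4f}")
```

For this model the no-click probability should reproduce the closed form $e^{-\eta|\alpha|^2}$, which gives a quick consistency check of the truncation at $M = 60$.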
Care has to be taken that the optimisation problem is well
conditioned in order to find the true POVM of the detector. In
finding their number basis representation we are deconvolving
a coherent state from our statistics, which is intrinsically
badly conditioned due to the importance of the wings of the
Gaussian. Similar issues of conditioning have been discussed
in the context of state and process tomography, see, e.g., Refs.
[43, 60]. Due to a large ratio between the largest and smallest
singular values of the matrices defining the quadratic problem,
small fluctuations in the probability distribution can result in
large variations for the reconstructed POVM. This can result
in operators that approximate the outcome statistics very well
and yet do not exhibit a smooth distribution in photon
number. We will discuss how to treat this problem in the next
section.
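The poor conditioning can be illustrated without any solver. In the sketch below, two hypothetical sets of diagonal POVM weights that differ by 0.2 in every Fock layer (one flat, one with an alternating "dip" pattern) are pushed through the Poissonian forward model. The weights are invented for illustration and do not correspond to the measured detector.

```python
import math

def poisson_weights(x, m_max):
    """F_k = exp(-x) * x**k / k! for k = 0..m_max, built recursively."""
    row = [math.exp(-x)]
    for k in range(1, m_max + 1):
        row.append(row[-1] * x / k)
    return row

def outcome_probability(x, theta):
    """p(x) = sum_k F_k(x) * theta_k for one outcome of a diagonal POVM."""
    weights = poisson_weights(x, len(theta) - 1)
    return sum(w * t for w, t in zip(weights, theta))

M_MAX = 60
smooth_theta = [0.5] * (M_MAX + 1)                                   # flat weights
jagged_theta = [0.5 + 0.2 * (-1) ** k for k in range(M_MAX + 1)]     # alternating dips

XS = [2.0, 5.0, 10.0, 20.0]   # illustrative values of |alpha|^2
theta_gap = max(abs(a - b) for a, b in zip(smooth_theta, jagged_theta))
prob_gaps = [abs(outcome_probability(x, smooth_theta)
                 - outcome_probability(x, jagged_theta)) for x in XS]

print(f"largest difference in theta: {theta_gap:.2f}")
for x, gap in zip(XS, prob_gaps):
    print(f"|alpha|^2 = {x:5.1f}: difference in probability {gap:.2e}")
```

For this particular perturbation the difference in probability is $0.2\,e^{-2|\alpha|^2}$ up to truncation, so a large, rapidly oscillating change in the POVM is almost invisible in the statistics, which is exactly why the inverse problem amplifies noise.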

The measured probabilities for each outcome as a function
of $|\alpha|^2$ are displayed in Fig. 11 and Fig. 10. The probability
distributions (equivalent, up to a factor $1/\pi$, to the Q-function of
the detector) show smooth profiles and distinct photon number
ranges of sensitivity for increasing number of clicks in
the detector. Fig. 12 shows the POVMs that result from the
optimisation in Eq. (12), which we will discuss later. A first
remarkable property is that $\pi_n$, being the POVM element for $n$ clicks,
shows zero amplitude for detecting fewer than $n$ photons. That
is, the detector shows essentially no dark counts. It should be
noted that this was not assumed at the outset and is purely the
result of the optimization. This sharp feature gives the detector
its discriminatory power, where $n$ clicks means at least $n$
photons in the input pulse.

FIG. 11: The measured probabilities for the "click" and "no-click"
events in the Avalanche Photodiode (APD) are shown as a function
of $|\alpha|^2 = \langle n\rangle$. The statistical error vertically is too small to be seen
and the jitter of $|\alpha|^2$ was estimated to be 2% of its value. An additional
5% systematic error in the calibration of the power meter is
present but can be absorbed as loss. From the reconstructed POVM
elements $\{\pi_n\}$ we generate the corresponding probability distributions
$\mathrm{tr}(\rho_\alpha \pi_n)$ (blue curves). These are generated for pure $|\alpha\rangle\langle\alpha|$
or mixed $\rho_{|\alpha|}$ and for $\pi_n$ reconstructed with the filter function or
without it. For all these options, the probability distributions (blue
lines) are so similar that they are indistinguishable on this scale.

To assess the performance of our method we compare it to
the model described in the previous section. This time, however,
the BS used in the model are not 50/50; instead, their reflectivities
($R = [0.5018, 0.5060, 0.4192]$) were measured experimentally.
This was done by measuring the reflected and transmitted
beams of a laser with a calibrated power meter. The
yellow bars in Fig. 12 show the absolute value of the difference
between the theoretical and the reconstructed POVM
elements. The magnitude

$$|\theta_k^{(n,\mathrm{theo})} - \theta_k^{(n,\mathrm{rec})}|$$



is shown stacked on top of each coefficient of the POVM elements,
where (theo) stands for theoretical and (rec) for reconstructed.
The small yellow bars reveal a good agreement with
the model. We also calculate a form of fidelity, finding that

$$F = \left(\mathrm{Tr}\sqrt{\sqrt{\pi_n^{(\mathrm{theo})}}\,\pi_n^{(\mathrm{rec})}\sqrt{\pi_n^{(\mathrm{theo})}}}\right)^2 \ge [\ldots].7\%$$

holds for all $n$. Note that to calculate $F$ we normalized the
POVM elements. This overlap indicates an excellent agreement
between the two.

In addition, one can reconstruct a probability distribution
from the found POVMs to fit the data. The reconstruction
is plotted in dark blue in Fig. 10 and Fig. 11. It is the
equivalent of the Q-function had our probe states $|\alpha\rangle\langle\alpha|$ with
suitable complex $\alpha$ been strictly pure. In fact, although formally
distinct, the probability distributions associated with the
POVMs reconstructed using mixed or pure states are practically
indistinguishable and are plotted together in Fig. 10 for
comparison.

Detector Wigner functions 

An alternative representation of the detectors, which can
give us more insight into their structure, comes from
quasi-probability distributions such as the Wigner function
[61, 62]. Since the POVM elements $\pi_n$ are self-adjoint
positive-semi-definite operators, a Wigner function $W_n$ can
be calculated in the standard way from the POVM element,

$$W_n(x,p) = \frac{1}{\pi\hbar}\int_{-\infty}^{\infty} dy\, \langle x-y|\pi_n|x+y\rangle\, e^{2ipy/\hbar}, \qquad (13)$$

where we have, as usual, now identified $(x,p) \in \mathbb{R}^2$ as phase
space coordinates of a single mode with $\alpha \in \mathbb{C}$. However,
since the POVMs do not have unit trace, this detector Wigner
function will not be normalized,

$$\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dp\, W_n(x,p) < 1. \qquad (14)$$

We should note that the marginals cannot be interpreted as
probability distributions, but we can still use $W_n$ to calculate
probabilities according to

$$p_{\rho,n} = 2\pi\hbar \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dp\, W_\rho(x,p)\, W_n(x,p). \qquad (15)$$

Since none of the detectors have phase sensitivity, their
Wigner functions are rotationally symmetric around the origin.
In Fig. 13 we display a cut of the TMD Wigner function
for the following POVM elements:

$\{\pi_0, \pi_1, \pi_2, \pi_3, \pi_4, \pi_5\}$.

Higher clicks are not displayed because their amplitude is too
small to be compared with the rest. The interesting feature
of the plot is the comparison with the theoretical TMD
Wigner functions one can generate with the model. Indeed,
comparing a theoretical loss-less TMD with the measured one,
we see how the amplitude of the Wigner function decreases
rapidly for higher photon numbers. On the other hand, comparison
with the lossy theoretical model reveals a good agreement.
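For a POVM element that is diagonal in the number basis, the Wigner function is simply the weighted sum of Fock-state Wigner functions, which depend only on the radius in phase space. The sketch below evaluates such a radial cut. It assumes the convention $\hbar = 1$, in which the Fock state $|k\rangle$ has $W_k(r) = \frac{(-1)^k}{\pi} e^{-r^2} L_k(2r^2)$ with $r^2 = x^2 + p^2$; the lossy no-click weights used in the example are a textbook model, not the reconstructed data.

```python
import math

def laguerre_values(t, k_max):
    """Laguerre polynomials L_0(t)..L_kmax(t) via the three-term recurrence."""
    values = [1.0]
    if k_max >= 1:
        values.append(1.0 - t)
    for j in range(1, k_max):
        values.append(((2 * j + 1 - t) * values[j] - j * values[j - 1]) / (j + 1))
    return values

def detector_wigner(theta, r):
    """Radial Wigner function of pi = sum_k theta[k] |k><k| (hbar = 1)."""
    lag = laguerre_values(2.0 * r * r, len(theta) - 1)
    prefactor = math.exp(-r * r) / math.pi
    return prefactor * sum((-1) ** k * th * lag[k] for k, th in enumerate(theta))

# Assumed example: no-click element of a lossy detector, theta_k = (1 - eta)^k.
ETA = 0.568
no_click_theta = [(1.0 - ETA) ** k for k in range(61)]
radial_cut = [(0.25 * i, detector_wigner(no_click_theta, 0.25 * i)) for i in range(13)]
for r, w in radial_cut:
    print(f"r = {r:4.2f}: W = {w:+.5f}")
```

At the origin all Laguerre polynomials equal one, so for this example the sum reduces to a geometric series and $W(0) \approx 1/[\pi(2-\eta)]$, which provides a simple check of the implementation.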



ILL CONDITIONING AND REGULARISATION 

One of the main problems encountered in the tomographic 
characterisation of the detectors has to do with the numerical 
FIG. 12: Reconstructed POVMs for (a) the photon-number resolving TMD and (b) the APD "yes/no" detector. TMD POVMs were obtained
up to element $|60\rangle\langle 60|$ (therefore $M = 60$), but are shown up to $|30\rangle\langle 30|$ for display purposes. APD POVMs are shown in full. Stacked on top
of each $\theta_k^{(n)}$, where $n$ is the number of clicks, we show $|\theta_k^{(n,\mathrm{rec})} - \theta_k^{(n,\mathrm{theo})}|$ in yellow. Here (rec) stands for reconstructed and (theo) is the theoretical
POVM expected from (a) a TMD modelled with 3 beam splitters of measured reflectivities and 52.1% overall loss and (b) a theoretical APD with
43.2% loss. Note that this result was obtained with a regularized optimisation as explained in the next section.

FIG. 13: Wigner function of the first POVM elements of the TMD. Since the detectors have no phase reference, their Wigner functions are
rotationally symmetric with respect to their center and a cut contains all the information. The dotted blue curve represents the Wigner function
of the reconstructed POVMs from 0 to 5 clicks. In red we can see the theoretical Wigner function for: (a) a theoretical TMD with 52% loss,
(b) a theoretical TMD without loss. Paying attention to the scale, we observe how dramatic the effect of loss is at damping the ripples in the
Wigner function. It is also worth noting that the end ripples in (a) for the "5 click" element are just an edge effect due to the number state cutoff.
stability of the reconstruction. Such problems are common in
tomography [43, 60]. Consider for example the transformations
involved in the inverse Radon transform and their inherent
instabilities. Note also how going from the Q-function to
the P-function is not always well defined [61]. Multiple tools
exist to bridge the link between homodyne tomography and
the density matrix description [66]. One of them involves the
use of pattern functions [58, 64, 68], that is, finding some
functions $G_k(\alpha)$ such that

$$\int Q^{(n)}(\alpha)\, G_k(\alpha)\, d^2\alpha = \theta_k^{(n)}.$$

However, finding the appropriate $G_k$ involves the irregular
wave functions (particular unbounded solutions of the
Schrodinger equation) and proving them to be appropriate is
typically as hard as estimating the error [63]. The use of maximum
likelihood has also been explored, particularly for
detector tomography [14, 15]. However, the speed of the convergence
is not generally guaranteed to be high, becoming exponential
for certain problems. Our approach, following the
spirit of maximum likelihood, translates the problem into a
quadratic optimisation one, allowing for efficient semi-definite
programming (SDP) (cf. Eq. (12)). We discuss here the details,
approximations and filters that lead to our solution of
the problem.

Truncating the Hilbert space 

The data was measured up to $|\alpha|^2 = 150$ but was truncated
at lower values in phase space. This was done to avoid noisy
behavior and the emergence of new regimes in the behavior of
the detector. Memory effects requiring a larger POVM space
were thus avoided, as discussed in the experimental section.
Notably, the effects related to the detector's dead time,
afterpulsing or the dark counts from possible overheating were
avoided by staying in a low ($|\alpha|^2 \sim 70$) photon number regime.

Pure vs. mixed 
FIG. 14: POVM reconstruction, using only the minimisation from Eq.
(17). Dark blue: $\pi_0 = \sum_k \theta_k^{(0)} |k\rangle\langle k|$; lighter blue: $\pi_1 =
\sum_k \theta_k^{(1)} |k\rangle\langle k|$, etc.
The Q-function of our detector (directly measured) is proportional
to

$$p_{\alpha,n} = \mathrm{Tr}\left(|\alpha\rangle\langle\alpha|\,\pi_n\right). \qquad (16)$$



From Eq. (jSjl and Eq. (j6]l, for a diagonal POVM, we can 
write the problem as 



$$\min \|P - F\Pi\|_2, \qquad (17)$$

with the usual constraints $\pi_n \ge 0$ and $\sum_n \pi_n = \mathbb{1}$. Using
a semi-definite programming toolbox such as YALMIP, the obtained POVMs
$\{\pi_n\}$ show irregular dips and a structure quite dissimilar
from what is expected of a TMD. Fig. 14 shows a
typical result, and Fig. 15 shows it for higher photon numbers,
revealing an even more irregular structure.



FIG. 15: POVM reconstruction, using only the minimisation from Eq.
(17). Dark blue: $\pi_0 = \sum_k \theta_k^{(0)} |k\rangle\langle k|$; lighter blue: $\pi_1 =
\sum_k \theta_k^{(1)} |k\rangle\langle k|$, etc., displayed up to number layer 30.



Describing the laser's amplitude uncertainty 

A first meaningful observation is that some level of uncertainty
existed in the amplitude $x = |\alpha|^2$ of the coherent states.
If $D$ values of $x$ were measured then the real

$$x = (x_1, x_2, \ldots, x_D)$$

might actually have been

$$x_\delta = (x_1(1+\delta_1), x_2(1+\delta_2), \ldots, x_D(1+\delta_D)),$$
FIG. 16: POVM reconstruction, using direct averaging (150 runs
with 1% Gaussian noise).



FIG. 17: POVM reconstruction, using direct averaging (300 runs 
with 2% Gaussian noise). 



with some vector of errors $(\delta_1, \ldots, \delta_D)$. To address the effect
of this uncertainty on our minimisation we can artificially
introduce noise and then average over many runs of the optimisation.
In other words, since $F$ in Eq. (17) depends on the
measured values of $|\alpha|^2$, we can substitute $x$ with $x_\delta$, where
$\delta = (\delta_1, \ldots, \delta_D)$ are independent and identically distributed
random variables. Using $x_\delta$ we run the optimisation and obtain
a family of estimated POVM elements (each element of
the family corresponds to a run of the optimisation with a different
$\delta$). As a first approximation we may use a Gaussian
probability distribution with zero mean and $\sigma = 2\%\,|\alpha|^2$.
Note that 2% was the measured standard deviation of the laser
amplitude from pulse to pulse, as shown in Fig. 18. Subsequently
we average over the POVMs obtained with different "jitters"
$\delta_j$ in $N$ realizations, obtaining

$$\Pi^{(\mathrm{average})} = \sum_{j=1}^{N} \Pi^{(\delta_j)} / N.$$

Averaging over 200 runs of the optimisation
improves the appearance of the POVMs but does little to remove the
"dips" observed. Fig. 16 and Fig. 17 are examples of this approach,
showing that this kind of averaging does not properly
counteract the fluctuations in the reconstructed POVM.



Using mixed input states 

A key observation showing that the previous approach is not
the appropriate treatment of uncertainty in $x$ is that each probe
state would be best described by a mixture of coherent states.



(18) 



oo 



(19) 



Here, $f_x$ would be some distribution centered around $x$ in
phase space, leading to a mixed Gaussian state in the case of a
Gaussian classical probability distribution. We can integrate
this state $\rho_x$ over the complex phase, since we have no phase
reference available, and focus solely on the amplitude of the
coherent states or mixtures thereof. Measurements reveal that
the intensity of the laser varies from pulse to pulse following
a distribution that looks like a Lorentzian with a tail (see
Fig. 18). A good approximation can, however, be made using a
Gaussian distribution, leading to a Gaussian state, with standard
deviation $\sigma = 0.02|\alpha|^2$, implying



1 



<T\/2^T^/n\m}. 



■n+m —(3 



with $f_x(\beta) = e^{-(\beta - x)^2/(2\sigma^2)}$. The detection probability for
outcome $n$ is then



Pi 



(n) 



(20) 



fe=0 



To simplify these calculations we can write a distribution in 

JIl ~ lal. 



with

$$g_\alpha(\beta) = e^{-(\beta-\alpha)^2/(2\Gamma^2)}. \qquad (21)$$
FIG. 18: Measurement of the laser's amplitude variations from pulse
to pulse.



In this case $\Gamma$ has been chosen such that the approximation
$f_x(\beta) \approx g_\alpha(\beta)$ holds. These subtleties, however, barely
alter our results, and the POVMs are as irregular as before.
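The small influence of the amplitude jitter can be made plausible numerically. The sketch below compares the photon-number distribution of a pure coherent state with that of a phase-averaged mixture whose intensity $|\alpha|^2$ is Gaussian distributed with a 2% relative standard deviation. The simple fixed-grid quadrature, the grid size and the choice $|\alpha|^2 = 10$ are assumptions for illustration, and the function names are hypothetical.

```python
import math

def poisson_pmf(k, x):
    """Photon-number distribution of a coherent state with |alpha|^2 = x."""
    return math.exp(-x + k * math.log(x) - math.lgamma(k + 1))

def mixed_pmf(k, x, rel_sigma=0.02, n_points=201, width=5.0):
    """Poissonian averaged over a Gaussian spread of the intensity around x."""
    sigma = rel_sigma * x
    step = 2.0 * width * sigma / (n_points - 1)
    total = 0.0
    norm = 0.0
    for i in range(n_points):
        xi = x - width * sigma + i * step            # grid over x +/- 5 sigma
        weight = math.exp(-0.5 * ((xi - x) / sigma) ** 2)
        total += weight * poisson_pmf(k, xi)
        norm += weight
    return total / norm                              # normalise the discrete weights

X = 10.0
pure = [poisson_pmf(k, X) for k in range(61)]
mixed = [mixed_pmf(k, X) for k in range(61)]
max_diff = max(abs(a - b) for a, b in zip(pure, mixed))
print(f"largest change in any photon-number probability: {max_diff:.2e}")
```

With these settings the largest change in any photon-number probability is of order $10^{-4}$, in line with the observation that the mixed-state description barely alters the results.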

To evaluate the difference introduced by the pure ($|\alpha\rangle\langle\alpha|$)
or mixed-state ($\rho_{|\alpha|}$) approach we have studied their influence
on the reconstructed POVMs. In the regularised optimisation
(i.e., for our final results), we have compared the POVMs obtained
with each description, finding that

$$\frac{\|\Pi_{\mathrm{pure}} - \Pi_{\mathrm{mixed}}\|_2}{\|\Pi_{\mathrm{mixed}}\|_2} < 0.7\% \qquad (22)$$

and the largest relative difference between any two $\theta_k^{(n)}$, coming
from a mixed-state or a pure-state derivation, was 1.3%.
Furthermore, the reconstructed probability distributions are so
close that they are indistinguishable on the scale of Fig. 10.
This reinforces our earlier expectation that technical noise in
the laser will be negligible when using single-photon-level coherent
states. This differs from homodyne tomography, where
technical noise can shift a strong local oscillator to a nearly
orthogonal state.

However, since the problem of the irregular POVMs is not
solved by the mixed state description we need to look further
into the origin of these irregularities. One first remarkable
(but expected) property is that large variations in the photon
number degree of freedom of the POVMs result in minuscule
differences in the probability distributions (see Fig. 12). Since
one convolves the photon number distribution with a Gaussian
in $\alpha$ to obtain the Q-function, this behavior is expected.
Conversely, this means that small errors or statistical
fluctuations in the Q-function can result in large errors in the
POVM elements. Consider for example that if instead of

$$\min \|P - F\Pi\|_2$$

we try to minimise

$$\|F^{-1}P - \Pi\|_2$$

the SDP solver finds no sensible solution. This is because,
using the Moore-Penrose pseudo-inverse, we find $F^{-1}F \neq \mathbb{1}$
due to its inherent ill conditioning, meaning that the ratio of
largest and smallest singular values in $F$ is large.

Various methods exist to try and regularise these problems.
Whatever the chosen method, it should assume as little knowledge
as possible about the specific form of the sought POVM.
Since $F$ has very small values for high photon numbers one
could enhance those values while preserving the minimisation
target. For example we could run the optimisation

$$\min \|PD - F\Pi D\|_2$$

where $D$ is a diagonal matrix aimed at regularising the problem.
This can be shown to introduce some improvement but
does not solve the "dip problem" completely. It is also hard
to find the exact form of $D$ that yields "good" results without
any prior knowledge about the expected POVMs. In addition
(roughly speaking) it is hard to find a balance between having
good results for low photon numbers and for high photon
numbers.

Another approach is to introduce a sort of damping or specific
penalisation. For example one could define a suitable diagonal
matrix $M$ and use it to redistribute the weight of each POVM element,
avoiding unreasonably large POVM element amplitudes (that
compensate for low values in $F$). The optimisation could be
recast as

$$\min\{\|P - F\Pi\|_2 + c\,\|M\Pi\|_2\}, \qquad (23)$$

with a small weight $c$ of order $10^{-2}$.

FIG. 19: Reconstructed POVM with damping regularisation, $\theta(k,n)$,
as a function of Fock layer $k$ and number of clicks $n$, obtained by
minimisation using the damping method of Eq. (23). Note that
the point of view is opposite that of the previous plots. We see some
dips around the 5-th and 7-th number layer.
A result of this can be seen in Fig. 19. This method has the
same shortcomings as the previous one: it is sensitive to the
choice of parameters and the exact form of $M$ is hard to determine
without detailed prior assumptions.

A more reasonable method is to capture the relative
smoothness of the POVM from a lossy detector. This method
is also called smoothing regularisation [18]. In this case one
single assumption needs to be made: the POVMs should exhibit
a certain degree of "smoothness".



Smooth or not? 



FIG. 20: Sensitivity to noise of the optimisation. Illustration of the
sensitivity to noise for two different minimisation methods ($y = 0$
corresponds to no regularisation, and $y \neq 0$ to an approach using a
smoothing regularisation). For each value of $y$ and of the noise level
we have run the optimisation 4 times and displayed
the results here to illustrate this variation.

Let us first define what we mean by smooth. Smooth will
mean in this context that the difference $(\theta_k^{(n)} - \theta_{k+1}^{(n)})$ is small
for all $k$ and $n$. In the optimisation context we will mean that
our minimisation is defined as follows:

$$\min\{\|P - F\Pi\|_2 + y S\}, \qquad (24)$$

with

$$S = \sum_{k,n} \left(\theta_k^{(n)} - \theta_{k+1}^{(n)}\right)^2,$$

for some fixed value of $y$. The smoothing function $S$ will
be independent of the detector, and will mildly penalize
non-smooth POVM elements. This approach is further substantiated
by the observation that the resulting POVMs are largely
independent of the weight $y$ that is given to the smoothness
penalty.
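A minimal sketch of such a smoothness penalty is given below, assuming the quadratic form $S = \sum_{k,n} (\theta_k^{(n)} - \theta_{k+1}^{(n)})^2$ suggested by the definition of smoothness above. The example POVMs (a textbook lossy click/no-click detector, and the same detector with an artificial dip at the fifth Fock layer) are hypothetical and serve only to show how a dip is penalised.

```python
def smoothness_penalty(theta):
    """S = sum over k, n of (theta[k][n] - theta[k+1][n])**2.

    theta[k][n] is the weight of Fock layer k in the POVM element for
    outcome n (a diagonal POVM stored as a list of rows).
    """
    return sum((theta[k][n] - theta[k + 1][n]) ** 2
               for k in range(len(theta) - 1)
               for n in range(len(theta[0])))

def regularised_objective(data_residual, theta, y):
    """Data term plus y times the smoothness penalty, as in the text."""
    return data_residual + y * smoothness_penalty(theta)

ETA = 0.568   # efficiency corresponding to 43.2% loss
M_MAX = 60
smooth_povm = [[(1.0 - ETA) ** k, 1.0 - (1.0 - ETA) ** k] for k in range(M_MAX + 1)]

# Same POVM with an artificial dip: layer 5 is assigned entirely to "click".
dipped_povm = [row[:] for row in smooth_povm]
dipped_povm[5] = [0.0, 1.0]

s_smooth = smoothness_penalty(smooth_povm)
s_dipped = smoothness_penalty(dipped_povm)
print(f"S (smooth) = {s_smooth:.4f}, S (with dip) = {s_dipped:.4f}")
```

Since the penalty only involves differences between neighbouring Fock layers, it does not encode any specific detector model, which is the property required of the filter function.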

As most quantum detectors, especially those discussed
here, are lossy, this is a particularly plausible feature. Indeed,
if an optical detector has a POVM element with non-zero amplitude
in $|n\rangle\langle n|$, then if it is lossy, it will have a positive amplitude
in $|n+1\rangle\langle n+1|, |n+2\rangle\langle n+2|, \ldots, |n+K\rangle\langle n+K|$,
decreasing with $K$ but different from zero. In fact, in general,
if the detector has a finite efficiency $\eta$ which can be modelled
with a BS, it will impose some smoothness on the distribution
$\theta_k^{(n)}$. That is because if $G(k)$ is the probability of registering
$k$ photons and $H(k')$ is the probability that $k'$ were present,
then the loss process will impose [67]:

$$G(k) = \sum_{k'=k}^{\infty} \binom{k'}{k} \eta^{k} (1-\eta)^{k'-k} H(k').$$

Consequently, if $\theta_k \neq 0$, then $\theta_{k+1}$, $\theta_{k+2}$ etc. cannot be zero,
but will have some relatively smooth distribution. This simple
physical argument makes a certain smoothness plausible (but
still should allow sharp transitions for $m < n$).
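The loss transformation is a binomial thinning of the photon-number distribution and can be written in a few lines. The sketch below is an illustration with a hypothetical input (exactly four photons) and 50% efficiency; the function name is not from the original work.

```python
import math

def apply_loss(h, eta):
    """Binomial loss: G(k) = sum_{k' >= k} C(k', k) eta^k (1-eta)^(k'-k) H(k').

    h[k'] is the probability that k' photons were present and eta is the
    detection efficiency; the returned list g[k] is the probability of
    registering k photons.
    """
    g = [0.0] * len(h)
    for k_present, weight in enumerate(h):
        for k in range(k_present + 1):
            g[k] += (math.comb(k_present, k)
                     * eta ** k * (1.0 - eta) ** (k_present - k) * weight)
    return g

# Hypothetical input: exactly four photons present, 50% efficiency.
h_four = [0.0, 0.0, 0.0, 0.0, 1.0]
g_four = apply_loss(h_four, 0.5)
print(g_four)
```

A sharp input distribution is spread over all lower photon numbers, which is the mechanism that makes the POVM weights of a lossy detector vary smoothly with photon number.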

For this detector (and for any photodiode based detector) 
assuming loss is reasonable and can make the "smoothness" 
requirement appropriate. Let us however see if, without look- 
ing at the specific shape of our POVM, we can find an opti- 
mal smoothing coefficient y and justify further the use of the 
smoothing regularisation. 

One way to test this method is to quantify how resilient it is to noise in the data. To do so we introduce additional noise in $x = |\alpha|^2$ to the measured data. For example, we can alter $x$ in $P_{i,n} = P(x_i(1 + \delta_i), n)$, where $\delta = (\delta_1, \ldots, \delta_D)$ is again a vector of random variables with a Gaussian distribution of zero mean. This simulates a statistical uncertainty in the measurement of the coherent state. To see its effect on the reconstruction we use the figure of merit $\|\Pi_{\delta} - \Pi_{\delta=0}\|_2$. This quantity measures how much the reconstructed POVMs differ from the one obtained without noise. It is seen that the additional smoothing penalty makes the optimisation more robust, largely independent of the value of $y$ (we can multiply $y$ by 100 and stay in the same regime). Using this smoothing regularisation with noisy data therefore seems a good choice.
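The noise test just described can be sketched as follows. This is an illustration under several assumptions, not the optimisation used in the paper: the positivity and normalisation constraints are dropped so that the problem becomes plain least squares, the smoothing term is taken to be a quadratic penalty on differences between neighbouring coefficients, the norm is the Frobenius norm, and the detector (`lossy_counter`), the probe range and the 1% noise level are invented for the example.

```python
import numpy as np
from math import comb

def coherent_probe_matrix(x, dim):
    """F[i, k] = exp(-x_i) x_i^k / k!, photon statistics of a coherent state with |alpha|^2 = x_i."""
    k = np.arange(dim)
    log_fact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, dim)))))
    return np.exp(np.outer(np.log(x), k) - x[:, None] - log_fact[None, :])

def lossy_counter(dim, n_out, eta):
    """Toy detector (assumption): outcome n fires when n photons survive loss; the last outcome collects the rest."""
    Pi = np.zeros((dim, n_out))
    for k in range(dim):
        for n in range(min(k, n_out - 2) + 1):
            Pi[k, n] = comb(k, n) * eta**n * (1 - eta) ** (k - n)
    Pi[:, -1] = 1.0 - Pi[:, :-1].sum(axis=1)
    return Pi

def reconstruct(P, F, y):
    """Minimise ||P - F Pi||^2 + y * sum_{k,n} (Pi[k,n] - Pi[k+1,n])^2 (unconstrained)."""
    dim = F.shape[1]
    D = np.diff(np.eye(dim), axis=0)              # first-difference operator
    A = np.vstack([F, np.sqrt(y) * D])
    b = np.vstack([P, np.zeros((dim - 1, P.shape[1]))])
    Pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return Pi

rng = np.random.default_rng(0)
dim, n_out = 30, 9
x = np.linspace(0.1, 20.0, 60)                    # nominal |alpha|^2 of the probe states
Pi_true = lossy_counter(dim, n_out, eta=0.48)
F = coherent_probe_matrix(x, dim)

P_clean = F @ Pi_true
delta = rng.normal(0.0, 0.01, size=x.size)        # 1% Gaussian error on x (assumption)
P_noisy = coherent_probe_matrix(x * (1 + delta), dim) @ Pi_true

def deviation(y):
    """Figure of merit ||Pi_delta - Pi_{delta=0}|| for a given smoothing weight y."""
    return np.linalg.norm(reconstruct(P_noisy, F, y) - reconstruct(P_clean, F, y))

dev_raw, dev_smooth = deviation(0.0), deviation(0.1)
print(dev_raw, dev_smooth)
```

Because the coherent-state matrix `F` is badly conditioned, the unregularised solution amplifies the perturbation far more than the smoothed one. The full constrained optimisation may behave differently in detail.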

We have seen how smoothing makes the optimization more robust against noise, but we should also ask how sensitive this optimisation is to the exact choice of $y$. To do so we may use the following procedure: compare the POVM obtained using $y = 0.1$ with that obtained when varying $y$ over 4 orders of magnitude. In Fig. 21 we plot the relative error $100\,\|\Pi_{y} - \Pi_{y=0.1}\|_2 / \|\Pi_{y=0.1}\|_2$. Remarkably, doubling the value of $y$ results in an overall relative error in the POVM of less than 1%. Multiplying (or dividing) $y$ by 10 gives a variation below 5%, and a 100-fold variation results in a 12% variation. Compared with the $y = 0$ case, which differs by 110%, we can conclude that the optimisation is quite insensitive to the exact choice of the smoothing parameter $y$. Table I provides some values for reference.
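For concreteness, the relative error used in this comparison can be computed as in the short sketch below. The function name and the two small matrices are placeholders chosen for illustration (they are not reconstructed POVMs), and the Frobenius norm is assumed.

```python
import numpy as np

def relative_error_percent(Pi_y, Pi_ref):
    """100 * ||Pi_y - Pi_ref|| / ||Pi_ref|| using the Frobenius norm."""
    return 100.0 * np.linalg.norm(Pi_y - Pi_ref) / np.linalg.norm(Pi_ref)

# Toy stand-ins: rows are photon numbers, columns are detector outcomes.
Pi_ref = np.array([[1.0, 0.0],
                   [0.5, 0.5],
                   [0.0, 1.0]])                    # plays the role of the y = 0.1 result
Pi_y = Pi_ref + 0.01 * np.array([[1.0, -1.0],
                                 [0.0, 0.0],
                                 [-1.0, 1.0]])     # plays the role of another y

err = relative_error_percent(Pi_y, Pi_ref)
print(err)
```

In practice `Pi_ref` would hold the coefficients $\theta_k^{(n)}$ reconstructed with $y = 0.1$ and `Pi_y` those obtained with the value of $y$ under test.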



FIG. 21: Illustration of how sensitive the optimisation is to the specific choice of $y$. This plot shows the relative error with respect to the POVM elements obtained using $y = 0.1$, as a function of $y$. In red, and only for reference (since it does not change with $y$), the value of the relative error for $y = 0$ (no smoothing) is shown. We vary $y$ in the range [0.001, 1]. It is remarkable that a 10000% variation in $y$ results in only a 12% variation. For $y \in [0.05, 0.2]$ the relative error is less than 2% in $\Pi$.

y         y variation    $\Pi$ relative error
0.0001    /1000          27.3%
0.001     /100           12.2%
0.01      /10            4%
0.05      /2             1%
0.5       ×5             3%
1         ×10            5%

TABLE I: This table illustrates how sensitive the optimisation is to the particular choice of $y$ (the reference smoothing strength is $y = 0.1$).

Sharp and smooth

There is of course a limit to how much we can penalize non-smooth POVMs. Is it possible for the smoothing regularisation to wash out all the sharp features of the POVM, thus smoothing in excess? This is a legitimate question that further restricts the reasonable range for $y$. To study that effect we analyse four cases:

1. A theoretical loss-less TMD, based on the model described in Eq. (11).

2. A lossy TMD, based on the above with added loss from an $R = 52\%$ BS.

3. A perfect photon number detector, that is with $\pi_n = |n\rangle\langle n|$.

4. An artificial POVM with sharp variations, containing the POVM elements:

$\pi_0 = |0\rangle\langle 0| + |2\rangle\langle 2|$
$\pi_1 = |1\rangle\langle 1| + \tfrac{1}{2}|3\rangle\langle 3|$
$\pi_2 = \tfrac{1}{2}|3\rangle\langle 3| + |4\rangle\langle 4| + |5\rangle\langle 5|$
$\pi_3 = |7\rangle\langle 7|$
$\pi_4 = \tfrac{1}{4}|6\rangle\langle 6| + \tfrac{1}{4}|8\rangle\langle 8|$
$\pi_5 = \tfrac{1}{4}|6\rangle\langle 6| + \tfrac{1}{4}|8\rangle\langle 8|$
$\pi_6 = \tfrac{1}{2}|6\rangle\langle 6|$
$\pi_7 = \tfrac{1}{2}|7\rangle\langle 7|$
$\pi_8 = \tfrac{1}{2}|8\rangle\langle 8| + \sum_{k=9} |k\rangle\langle k|$

To study the smoothing we generate the POVM elements $\{\pi_n\}$ numerically, build a probability distribution $\mathrm{Tr}(\rho_\alpha \pi_n)$ and retrieve the $\pi_n$ using the optimisation from Eq. (24) for an increasing range of $y$. Then we compare these results with the theoretical POVMs we defined in order to generate the probability distribution. All optimisations are done using the mixed-state approach from Eq. (21). Broadly speaking, we find two distinct behaviors: POVMs with terms that decay slowly in photon number need regularisation and are quite insensitive to the precise $y$. For sharp POVMs (without loss) the range $0 < y < 0.01$ preserves their shape quite well, but further smoothing hides their true shape. These properties are further illustrated in the figures that follow.

Lossy TMD

Fig. 23 presents the evolution of the "4 click" POVM element as we add more smoothing (or increase $y$ in Eq. (24)). This element is chosen as an illustrative example, but more details can be found in Ref. [59]. The figure shows in blue the coefficients $\theta_i^{(4,\mathrm{rec})}$ of $\pi_4^{(\mathrm{rec})} = \sum_{i} \theta_i^{(4,\mathrm{rec})} |i\rangle\langle i|$, where (rec) means reconstructed. In yellow, stacked on top of $\theta_i^{(4,\mathrm{rec})}$, we display $|\theta_i^{(4,\mathrm{theo})} - \theta_i^{(4,\mathrm{rec})}|$, where theo refers to the original POVM we used to generate the probability distribution. Clearly the smoothing improves the result and the exact value of $y$ is rather unimportant. A sharp feature that is preserved, however, is $\theta_i^{(4)} = 0$ for $i < 4$, showing good agreement with the model.



How does smoothing affect the Optimization

FIG. 22: Illustration of how too much smoothing can fail to capture the sharp variations of a POVM. We define $\Pi_{\mathrm{theo}}$ as the matrix containing the POVM elements of the theoretical POVMs. From them we generate a probability distribution and reconstruct the POVMs $\Pi_{\mathrm{optim}}$ with the smoothing regularised optimisation. The dotted lines represent $\|\Pi_{\mathrm{optim}} - \Pi_{\mathrm{theo}}\|_2$ for different values of $y$ and for a variety of POVMs (see following plots). The horizontal lines represent that same difference for $y = 0$ and are plotted for reference.

FIG. 23: Smoothing evolution for a lossy TMD detector (loss = 52%). We show as an example the evolution of the POVM element $\pi_4 = \sum_{i=0} \theta_i^{(4)} |i\rangle\langle i|$ as we increase the amount of smoothing (in $y$). The yellow bars display $|\theta_i^{(4,\mathrm{theo})} - \theta_i^{(4,\mathrm{rec})}|$ stacked on top of $\theta_i^{(4,\mathrm{rec})}$.



Loss-less TMD 



Fig. 24 also shows the "4 click" event and the error associated with the reconstruction (yellow). This TMD shows in its distribution the finite number of bins, as we described earlier. The distribution is not as broad as that of the lossy TMD, and the smoothing is therefore not so effective. The raw SDP, with $y = 0$, performs quite well, and the POVM is quite insensitive to the smoothing, although when the smoothing is given 1/2 of the weight in the optimisation ($y \approx 0.05$) it starts to become harmful.



Perfect number detector 



Fig. 25 also shows the "4 click" event, which in this case is simply $\pi_4 = |4\rangle\langle 4|$. A very interesting feature is that the simple SDP with $y = 0$ achieves a perfect result. This happens in spite of using a mixed state as a probe state (mixture of amplitudes $|\alpha|$ around $|\alpha\rangle\langle\alpha|$). The reconstruction is then robust for very well defined and sharp features, where the higher decaying coefficients do not introduce instabilities.

FIG. 24: Smoothing evolution for a perfect TMD detector (no loss). We show as an example the evolution of the POVM element $\pi_4 = \sum_{i=0} \theta_i^{(4)} |i\rangle\langle i|$ as we increase the amount of smoothing (in $y$). The yellow bars display $|\theta_i^{(4,\mathrm{theo})} - \theta_i^{(4,\mathrm{rec})}|$ stacked on top of $\theta_i^{(4,\mathrm{rec})}$.

[Figure panels: "4 - clicks" reconstruction of the perfect $|n\rangle\langle n|$ detector for $y = 0$ (no smoothing), $y = 0.001$ (slight smoothing), $y = 0.01$ (smoothing), $y = 0.1$ (smoothing) and $y = 0.5$ (strong smoothing).]

FIG. 25: Smoothing evolution for a perfect photon number detector, that is one with $\pi_n = |n\rangle\langle n|$. We show as an example the evolution of the POVM element $\pi_4 = \sum_{i=0} \theta_i^{(4)} |i\rangle\langle i|$ as we increase the amount of smoothing (in $y$). The yellow bars display $|\theta_i^{(4,\mathrm{theo})} - \theta_i^{(4,\mathrm{rec})}|$ stacked on top of $\theta_i^{(4,\mathrm{rec})}$.

Sharp POVM

We now discuss the situation of a POVM element that is not related to an experiment, but has been artificially generated to identify the limit of the smoothing regularisation. The element displayed in Fig. 26 is $\pi_4 = |7\rangle\langle 7| + |9\rangle\langle 9|$, and we can see that $y = 0.1$ is already too much smoothing. Certainly, to reconstruct a completely loss-less detector with such a structure, smoothing is not an appropriate strategy. We must remember, however, that all current photon-number detectors that count particles do exhibit loss, and therefore have some degree of smoothness in them.

Sharp POVM with loss

The previous case could have given the impression that the reconstruction fails for a sharp POVM. However, one has to stress that smoothing (or a regularization for the optimisation) is necessary when there is loss and an ill-conditioned matrix (which is the generic case in quantum optics when using coherent states for detector tomography). Therefore, it is worth considering what happens when an invented POVM (such as the previous one) is made more realistic by adding some loss. Fig. 27 illustrates this, showing the reconstruction of the previous sharp POVM element after it has suffered a 20% loss. Indeed, in this case we can see a clear improvement as the smoothing helps regularize the optimisation.
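To illustrate why a smoothness penalty suits the lossy element but not the loss-less one, the sketch below applies a 20% loss to $\pi_4 = |7\rangle\langle 7| + |9\rangle\langle 9|$ and compares a simple roughness measure before and after. The measure (sum of squared differences between neighbouring coefficients) is an assumed stand-in for the smoothing term of the optimisation, and the names `apply_loss` and `roughness` are hypothetical.

```python
import numpy as np
from math import comb

def apply_loss(theta, eta):
    """Coefficients of a diagonal POVM element after a beam splitter of transmission eta
    is placed in front of the detector (truncated at len(theta) photons)."""
    dim = theta.size
    out = np.zeros(dim)
    for kp in range(dim):            # photons entering the beam splitter
        for k in range(kp + 1):      # photons that survive and reach the detector
            out[kp] += comb(kp, k) * eta**k * (1 - eta) ** (kp - k) * theta[k]
    return out

def roughness(theta):
    """Sum of squared differences between neighbouring coefficients (assumed penalty form)."""
    return float(np.sum(np.diff(theta) ** 2))

dim = 30
sharp = np.zeros(dim)
sharp[[7, 9]] = 1.0                  # pi_4 = |7><7| + |9><9|
lossy = apply_loss(sharp, eta=0.8)   # the same element after a 20% loss

print(roughness(sharp), roughness(lossy))
```

The loss-less element pays a large penalty for its isolated peaks, whereas the lossy element keeps its sharp onset at seven photons while acquiring a smooth tail that the regularisation barely penalises.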
FIG. 26: Smoothing evolution for an invented POVM with sharp variations. Displayed is $\pi_4 = |7\rangle\langle 7| + |9\rangle\langle 9|$. We show as an example the evolution of the POVM element $\pi_4 = \sum_{i=0} \theta_i^{(4)} |i\rangle\langle i|$ as we increase the amount of smoothing (in $y$). The yellow bars display $|\theta_i^{(4,\mathrm{theo})} - \theta_i^{(4,\mathrm{rec})}|$ stacked on top of $\theta_i^{(4,\mathrm{rec})}$.

FIG. 27: Smoothing evolution for an invented POVM with sharp variations which has suffered a 20% loss (modelled by interposing a beam splitter). Displayed is the reconstruction of the element $\pi_4 = |7\rangle\langle 7| + |9\rangle\langle 9|$ when it suffers the mentioned loss. The reconstruction shows

CONCLUSION

As quantum information and computation implementations evolve, detectors are becoming more complex. In addition, cryptographic security requires a careful statement of assumptions, ideally kept to a minimum. This, as we have seen, calls for a black-box characterisation of the operators they implement.
We have seen the first implementation of this type of tomog- 
raphy. We discussed in detail the first experimental realisa- 
tion of quantum detector tomography completing the triad of 
experimental state EE] [91, process |l4l|5l[T0l[TTl, and detec- 
tor tomography ifTTll . This detector characterisation opens up 
more flexible and complex ways of detecting quantum states 
and accurately preparing non-classical light. 

The reconstruction methods are simple and efficient. However, one has to pay close attention to the subtleties behind the ill-conditioning of such reconstructions, whether it is state or detector tomography. Fully characterising a detector with this method can help get rid of complex or erroneous assumptions in the modelling. Furthermore, once they are fully characterised, one can re-design or alter the detectors with direct feedback on their performance.

Detector tomography significantly benefits state tomography or metrology, as well as state preparation and the implementation of protocols in quantum information requiring detectors in state manipulation. Importantly, it enables the use of detectors that are noisy, non-linear or that operate outside their intended range. One conclusion is that lossy detectors are often just as useful as perfect ones, as long as we know the exact POVMs and can describe the rest of the experiment accordingly. This method will also allow the benchmarking of similar detectors, making performance comparisons possible. Indeed, one can also ask questions concerning the power each detector has for preparing non-classical states. This opens a path for the experimental study of novel concepts such as the non-classicality of detectors. Another promising avenue is to translate homodyne tomography techniques to optical detector tomography, for example by defining the detector tomography equivalents of balanced noise-reduction, direct measurement of the Wigner function, or pattern functions. Naturally, an immediate next step would involve characterizing detectors with off-diagonal terms and phase sensitivity.